Repairing inconsistent dimensions in data warehouses
Abstract
A dimension in a Data Warehouse (DW) is a set of elements connected by a hierarchical relationship. The elements are used to view summaries of data at different levels of abstraction. To support efficient processing of such summaries, a dimension is usually required to satisfy different classes of integrity constraints. In scenarios where the constraints properly capture the semantics of the DW data but are not satisfied by the dimension, the problem of repairing (correcting) the dimension arises. In this paper, we study the problem of repairing a dimension in the context of two main classes of integrity constraints: strictness and covering constraints. We introduce the notion of a minimal repair of a dimension: a new dimension that is consistent with respect to the set of integrity constraints and is obtained by applying a minimal number of updates to the original dimension. We study the complexity of obtaining minimal repairs, and show how they can be characterized using Datalog programs with weak constraints under the stable model semantics.
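To make the two constraint classes concrete, the following Python sketch detects violations in a toy dimension. It is an illustration under stated assumptions, not the paper's method: it assumes that strictness means an element reaches at most one ancestor in any category, and that covering means an element has a direct parent in every category its own category links to in the schema. The names `parent_of`, `category_of`, `schema`, and the Store/City/Country data are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy dimension: Store -> City -> Country.
schema = {"Store": {"City"}, "City": {"Country"}}
category_of = {
    "s1": "Store", "s2": "Store",
    "c1": "City", "c2": "City",
    "k1": "Country", "k2": "Country",
}
# Child-to-parent edges; s1 has two cities, s2 has none.
parent_of = {"s1": {"c1", "c2"}, "c1": {"k1"}, "c2": {"k2"}}


def ancestors_by_category(element, parent_of, category_of):
    """Collect, per category, every ancestor reachable from `element`."""
    reached = defaultdict(set)
    stack = list(parent_of.get(element, ()))
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        reached[category_of[node]].add(node)
        stack.extend(parent_of.get(node, ()))
    return reached


def strictness_violations(parent_of, category_of):
    """Assumed definition: an element may reach at most one ancestor per category."""
    violations = []
    for element in sorted(category_of):
        reached = ancestors_by_category(element, parent_of, category_of)
        for category in sorted(reached):
            if len(reached[category]) > 1:
                violations.append((element, category, sorted(reached[category])))
    return violations


def covering_violations(parent_of, category_of, schema):
    """Assumed definition: an element needs a direct parent in each parent category."""
    violations = []
    for element in sorted(category_of):
        parent_categories = {category_of[p] for p in parent_of.get(element, ())}
        for required in sorted(schema.get(category_of[element], ())):
            if required not in parent_categories:
                violations.append((element, required))
    return violations


strict_issues = strictness_violations(parent_of, category_of)
cover_issues = covering_violations(parent_of, category_of, schema)
print(strict_issues)
print(cover_issues)
```

In this toy instance, `s1` breaks strictness because it rolls up to two cities (and therefore two countries), and `s2` breaks covering because it has no city. A repair in the sense of the abstract would insert or delete child-to-parent edges until both lists are empty, preferring the fewest changes; the sketch only detects violations and does not compute repairs.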
Similar resources
Efficient Algorithms for Repairing Inconsistent Dimensions in Data Warehouses
Dimensions in Data Warehouses (DWs) are usually modeled as a hierarchical set of categories called the dimension schema. To guarantee summarizability, that is, the capability of using pre-computed answers at lower levels to compute answers at higher levels, a dimension is required to be strict and covering, meaning that every element of the dimension must be connected to a unique ancestor in ea...
Logic Programs for Repairing Inconsistent Dimensions in Data Warehouses
A Data Warehouse (DW) is a data repository that integrates data from multiple sources and organizes the data according to a set of data structures called dimensions. Each dimension provides a perspective from which the data can be viewed. To support efficient processing of queries, a dimension is usually required to satisfy different classes of integrity constraints. In this paper, ...
Performance Evaluation of the Central Stores of Hospitals Affiliated to Tehran University of Medical Sciences in 2018
Background: Due to the presence of valuable and expensive equipment in hospitals’ warehouses, scientific management and continuous evaluation play an important role in improving the performance of warehouses and thereby the performance of hospitals’ wards. This study aimed to evaluate the performance of the central stores of hospitals affiliated to Tehran University of Medical Sciences (TUMS). ...
Handling Inconsistencies in Data Warehouses
Data warehouses (DWs) can become inconsistent when some dimensional constraints are not satisfied by the dimension instances. In this paper, we present preliminary results about the effects of the violation of partitioning constraints in homogeneous dimension instances over aggregation queries, and in particular over the summarizability property (SUMM) of the DWs. We are interested in finding w...
Semi-automatic Discovery of Mappings Between Heterogeneous Data Warehouse Dimensions
Data Warehousing is the main Business Intelligence instrument for the analysis of large amounts of data. It permits the extraction of relevant information for decision making processes inside organizations. Given the great diffusion of Data Warehouses, there is an increasing need to integrate information coming from independent Data Warehouses or from independently developed data marts in the s...
Journal title:
- Data Knowl. Eng.
Volume 79-80, issue
Pages -
Publication date: 2012